Structural Equation Modelling in R

Author

Martin Schweinberger

Introduction

This tutorial introduces Structural Equation Modelling (SEM) — a powerful and flexible family of multivariate statistical techniques that allows researchers to simultaneously model multiple relationships among variables, account for measurement error, and test theories about constructs that cannot be directly observed. Where simple linear regression models a single outcome from one or more predictors, SEM can model entire systems of relationships, including situations where the same variable acts as both a predictor and an outcome, and where some of the most important variables in a theory are not directly measurable at all.

SEM is particularly well suited to the language sciences. Much of what linguists and applied linguists care about — language anxiety, motivation, metalinguistic awareness, communicative competence, reading ability — cannot be captured in a single measurement. These are latent constructs: theoretical entities that we infer indirectly from a set of observable indicators such as questionnaire items or test scores. SEM provides a principled framework for doing exactly this, and then for examining how these latent constructs relate to one another and to observable outcomes.

This tutorial is aimed at beginners with no prior exposure to SEM. You do not need to have studied factor analysis or path analysis before, though familiarity with basic regression is helpful. The goal is to build conceptual understanding from the ground up and to equip you with the practical skills to fit, evaluate, and report SEM models in R using the lavaan package.


What This Tutorial Covers
  1. Conceptual foundations — latent vs. observed variables, path diagrams, and the logic of SEM
  2. Confirmatory Factor Analysis (CFA) — specifying and evaluating measurement models
  3. Model fit assessment — CFI, TLI, RMSEA, SRMR, and how to interpret them
  4. Full SEM — combining the measurement model with structural (directional) paths
  5. Mediation analysis — testing indirect effects within an SEM framework
  6. Model comparison and modification — nested model tests and modification indices
  7. Reporting standards — model paragraphs and conventions for SEM results

Preparation and Session Set-up

This tutorial requires several R packages. If you have not yet installed them, run the code below. This only needs to be done once.

Code
# install packages
install.packages("lavaan")
install.packages("semPlot")
install.packages("semTools")
install.packages("psych")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("tidyr")
install.packages("flextable")
install.packages("checkdown")

Once installed, load the packages:

Code
# load packages
library(lavaan)      # SEM and CFA estimation
library(semPlot)     # path diagram visualisation
library(semTools)    # reliability and model comparison tools
library(psych)       # descriptive statistics and correlation matrices
library(dplyr)       # data manipulation
library(ggplot2)     # data visualisation
library(tidyr)       # data reshaping
library(flextable)   # formatted tables
library(checkdown)   # interactive quiz questions

The Dataset

Throughout this tutorial we use a simulated dataset inspired by research on second-language (L2) writing. The data represent 300 university students who completed a battery of questionnaire scales and an academic writing task. The dataset includes:

  • Language Anxiety (anx1–anx3): three Likert-scale items measuring the degree to which students feel anxious when writing in their L2 (higher = more anxious)
  • Writing Self-Efficacy (eff1–eff3): three items measuring students’ confidence in their L2 writing ability (higher = greater self-efficacy)
  • Motivation (mot1–mot3): three items measuring students’ intrinsic motivation to improve their L2 writing (higher = more motivated)
  • Writing Score (writing_score): a holistic score (0–100) assigned by trained raters to an in-class academic writing task

Because the data are simulated in R, no external file is needed — you can reproduce the entire analysis from the code below.

Code
set.seed(42)
n <- 300

# --- Latent variable scores ---
anxiety  <- rnorm(n, 0, 1)
efficacy <- rnorm(n, 0, 1)
motivat  <- 0.55 * efficacy + rnorm(n, 0, sqrt(1 - 0.55^2))

# --- Observed indicators (each = loading * latent + unique error) ---
anx1 <- 0.78 * anxiety  + rnorm(n, 0, sqrt(1 - 0.78^2))
anx2 <- 0.72 * anxiety  + rnorm(n, 0, sqrt(1 - 0.72^2))
anx3 <- 0.80 * anxiety  + rnorm(n, 0, sqrt(1 - 0.80^2))

eff1 <- 0.80 * efficacy + rnorm(n, 0, sqrt(1 - 0.80^2))
eff2 <- 0.74 * efficacy + rnorm(n, 0, sqrt(1 - 0.74^2))
eff3 <- 0.77 * efficacy + rnorm(n, 0, sqrt(1 - 0.77^2))

mot1 <- 0.73 * motivat  + rnorm(n, 0, sqrt(1 - 0.73^2))
mot2 <- 0.76 * motivat  + rnorm(n, 0, sqrt(1 - 0.76^2))
mot3 <- 0.70 * motivat  + rnorm(n, 0, sqrt(1 - 0.70^2))

# --- Outcome: Writing Score (influenced by efficacy, anxiety, motivation) ---
writing_score <- 55 + 10 * efficacy - 6 * anxiety + 4 * motivat + rnorm(n, 0, 6)
writing_score <- round(pmin(pmax(writing_score, 10), 100))

# --- Assemble data frame ---
semdata <- data.frame(anx1, anx2, anx3,
                      eff1, eff2, eff3,
                      mot1, mot2, mot3,
                      writing_score)

The first ten rows of semdata (values rounded to three decimals):

  anx1    anx2    anx3    eff1    eff2    eff3    mot1    mot2    mot3   writing_score
 1.659   0.469   0.736  -0.736  -0.398   0.843  -0.526  -0.605  -0.844      45
-0.596  -0.381  -0.533   0.336   1.186   0.202   0.266   0.071  -0.181      67
 0.344   0.486  -0.302  -0.390  -0.015   0.066   0.522   0.379   0.409      53
 0.222   0.719   1.005   0.183  -0.086   0.227   2.006   0.091   1.402      59
 1.679   0.899  -0.154  -1.499  -0.535  -0.164  -1.029   0.245  -0.174      48
-1.934   0.571   0.119  -0.136  -0.430   0.058   1.282   0.397   0.967      66
 1.230  -0.597   1.731   0.955   0.484   0.306  -0.258   0.003   0.359      58
-0.005   1.131  -0.785   1.488   1.098   0.538   1.091   0.018   0.695      77
 1.708   1.770   2.228   0.247  -1.205  -1.260  -0.430  -0.298   0.541      26
-1.024  -0.441  -1.315   0.132  -0.253  -0.004  -1.627  -1.623  -2.213      41


Conceptual Foundations

Section Overview

What you’ll learn: The core ideas underpinning SEM — latent variables, measurement error, path diagrams, and the two-component structure of a full SEM

Why it matters: SEM notation and vocabulary are quite different from ordinary regression. Building a solid conceptual foundation before fitting models prevents common misinterpretations.

Observed vs. Latent Variables

A fundamental distinction in SEM is between variables you can observe directly and those you cannot.

Observed (manifest) variables are things you actually measure and record: a Likert-scale item, a test score, a reaction time, a corpus frequency count. They appear as columns in your dataset.

Latent variables are theoretical constructs that you cannot measure directly. Language anxiety, motivation, and writing self-efficacy are classic examples from applied linguistics. No single questionnaire item perfectly captures any of these constructs — each item is merely a fallible indicator. Latent variables are never columns in your dataset; instead, they are modelled as common causes of their observed indicators.

This distinction matters because measurement error is unavoidable whenever we use observed items to represent theoretical constructs. If we ignore this error — for example, by averaging questionnaire items and treating the result as if it were the true construct — we introduce bias into our estimates of relationships. SEM addresses this explicitly: it partitions the variance in each observed indicator into a part explained by the underlying latent variable and a part attributed to unique error (random noise plus any systematic variance not shared with the other indicators).
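This variance partition can be made concrete with a short simulation (a standalone illustration, not part of the tutorial dataset): for a standardised indicator with a loading of 0.8, only 0.8² = 0.64 of its variance reflects the construct; the rest is unique error.

```r
# Illustration: variance partition for one standardised indicator
# item = loading * latent + error, with Var(latent) = 1
set.seed(1)
latent  <- rnorm(10000)                            # true construct scores
loading <- 0.8
item    <- loading * latent + rnorm(10000, 0, sqrt(1 - loading^2))

var(item)             # ~1: total indicator variance
loading^2             # 0.64: share explained by the construct
1 - loading^2         # 0.36: share attributed to unique error
cor(item, latent)^2   # empirical check: close to 0.64
```

This is exactly the decomposition that the CFA loadings and residual variances estimate for each indicator.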

The Two Building Blocks of SEM

A full structural equation model is composed of two sub-models:

Sub-model Technical name What it specifies
Measurement model Confirmatory Factor Analysis (CFA) Which observed items are indicators of which latent variables; how strongly each item loads onto its construct; how much unique error each item has
Structural model Path model Directional relationships among latent variables (and between latent variables and observed outcomes); regression-like paths encoding theoretical predictions

In the standard two-step approach to SEM (Anderson and Gerbing 1988), researchers first establish an adequate measurement model (Step 1) before testing the structural paths of theoretical interest (Step 2). This tutorial follows this workflow: we build and evaluate a CFA in Section 3 and then add structural paths in Section 5.

Path Diagrams

SEM models are almost always communicated visually through path diagrams. The notation is standardised:

Symbol Represents
Rectangle Observed (manifest) variable
Oval / ellipse Latent variable
Single-headed arrow (→) Directional path (a regression-type effect)
Double-headed curved arrow (↔︎) Covariance or correlation
Small arrow into rectangle Residual / unique error for that indicator
Small arrow into oval Disturbance (residual error for an endogenous latent variable)

In a measurement model, ovals point to rectangles: the latent construct is hypothesised to cause variation in its observed indicators. In a structural model, ovals point to other ovals, encoding directional theoretical predictions among constructs.

SEM is a Confirmatory, Theory-Driven Method

Unlike Exploratory Factor Analysis (EFA), which discovers factor structure empirically from the data, SEM requires the researcher to specify the model in advance based on theory. Every path in the diagram — every arrow that is included or excluded — reflects a theoretical decision. A good model fit indicates that the specified model is consistent with the data; it does not prove the model is the only correct one. Alternative models that fit equally well are always possible (this is the problem of equivalent models). Always ground your SEM specifications in theory, not post-hoc data exploration.

A Conceptual Map of Our Example

Our theoretical model for the L2 writing dataset can be described as follows:

  1. Language Anxiety, Writing Self-Efficacy, and Motivation are latent constructs, each measured by three questionnaire items.
  2. We expect Self-Efficacy and Anxiety to have opposite effects on Writing Score: greater self-efficacy should improve performance; greater anxiety should impair it.
  3. Self-Efficacy is also expected to influence Motivation (students who feel more capable tend to be more motivated), and Motivation may in turn have a positive effect on Writing Score. This indirect path constitutes a mediation hypothesis.

This conceptual model drives all the analytic choices that follow.


Descriptive Statistics and Correlations

Section Overview

What you’ll learn: How to examine the observed variables before fitting any model

Key steps: Descriptive statistics, distribution checks, inter-item correlations

Before fitting any SEM, it is good practice to examine the distributions and inter-relationships of your observed variables. Severe non-normality or implausible correlations can signal problems that need to be addressed before modelling.

Descriptive Statistics

Code
psych::describe(semdata) %>%
  dplyr::select(n, mean, sd, median, skew, kurtosis, min, max) %>%
  round(3) %>%
  tibble::rownames_to_column("Variable") %>%
  flextable() %>%
  flextable::set_table_properties(width = .99, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Descriptive statistics for all observed variables.") %>%
  flextable::border_outer()

Variable          n    mean      sd  median    skew  kurtosis     min      max
anx1            300  -0.018   1.022   0.029  -0.073    -0.301  -2.968    2.660
anx2            300  -0.043   0.973  -0.027  -0.075     0.118  -3.326    2.450
anx3            300   0.016   0.983   0.052  -0.039     0.074  -3.246    2.736
eff1            300  -0.032   1.012  -0.057   0.143    -0.061  -2.916    3.006
eff2            300  -0.002   1.050  -0.012   0.014    -0.184  -2.972    3.887
eff3            300   0.016   0.975  -0.001   0.165     0.303  -2.995    3.133
mot1            300  -0.109   0.983  -0.114  -0.153    -0.057  -2.735    2.438
mot2            300  -0.072   0.964  -0.035   0.187     0.168  -2.470    3.216
mot3            300  -0.071   0.959  -0.065  -0.064    -0.172  -2.668    2.821
writing_score   300  54.560  14.778  54.000   0.036    -0.199  13.000  100.000

All questionnaire items are centred near zero, as expected for standardised simulated data. Skewness and kurtosis values all fall within the commonly cited acceptable range of [−1, +1], so the multivariate normality assumption underlying maximum likelihood estimation in lavaan is not substantially violated.
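If the items had shown substantial skew or kurtosis, plain ML would be a questionable choice. lavaan offers robust maximum likelihood as a drop-in alternative; a minimal sketch, using the three-factor model specified in the CFA section below:

```r
# Robust ML ("MLR"): Huber-White standard errors plus a scaled
# (Yuan-Bentler) test statistic; the model syntax and the rest of
# the workflow are unchanged
cfa_robust <- lavaan::cfa(cfa_model,
                          data      = semdata,
                          estimator = "MLR")
```

With roughly normal data, as here, MLR and ML give very similar estimates, so the default ML estimator is retained throughout this tutorial.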

Correlation Matrix

A correlation matrix helps us verify that items within the same scale correlate with each other (convergent evidence) and that items from different scales correlate less strongly (discriminant evidence).

Code
cor_mat <- cor(semdata %>% dplyr::select(-writing_score)) %>%
  round(2)

cor_mat %>%
  as.data.frame() %>%
  tibble::rownames_to_column("Variable") %>%
  flextable() %>%
  flextable::set_table_properties(width = .99, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 10) %>%
  flextable::fontsize(size = 10, part = "header") %>%
  flextable::align_text_col(align = "center") %>%
  flextable::set_caption(caption = "Pearson correlation matrix for the nine questionnaire items.") %>%
  flextable::border_outer()

Variable   anx1   anx2   anx3   eff1   eff2   eff3   mot1   mot2   mot3
anx1       1.00   0.56   0.57   0.03  -0.04   0.00   0.01  -0.03  -0.01
anx2       0.56   1.00   0.56   0.03  -0.07   0.00   0.03  -0.01  -0.01
anx3       0.57   0.56   1.00   0.01  -0.04  -0.01   0.05  -0.02   0.07
eff1       0.03   0.03   0.01   1.00   0.60   0.58   0.30   0.36   0.32
eff2      -0.04  -0.07  -0.04   0.60   1.00   0.53   0.24   0.31   0.27
eff3       0.00   0.00  -0.01   0.58   0.53   1.00   0.23   0.29   0.22
mot1       0.01   0.03   0.05   0.30   0.24   0.23   1.00   0.53   0.50
mot2      -0.03  -0.01  -0.02   0.36   0.31   0.29   0.53   1.00   0.56
mot3      -0.01  -0.01   0.07   0.32   0.27   0.22   0.50   0.56   1.00

Code
# Visualise correlations as a heatmap
cor_long <- cor_mat %>%
  as.data.frame() %>%
  tibble::rownames_to_column("Var1") %>%
  tidyr::pivot_longer(-Var1, names_to = "Var2", values_to = "r")

ggplot(cor_long, aes(x = Var1, y = Var2, fill = r)) +
  geom_tile(color = "white") +
  geom_text(aes(label = round(r, 2)), size = 3.2) +
  scale_fill_gradient2(low = "tomato", mid = "white", high = "steelblue",
                       midpoint = 0, limits = c(-1, 1), name = "r") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        panel.grid = element_blank()) +
  labs(title = "Correlation heatmap: nine questionnaire items",
       x = "", y = "")

The heatmap confirms the expected pattern: items within each scale (e.g., anx1–anx3) correlate strongly with each other and more weakly with items from the other scales. The efficacy and motivation items show moderate cross-scale correlations, consistent with our theoretical expectation that the two constructs are related.


Confirmatory Factor Analysis (CFA)

Section Overview

What you’ll learn: How to specify, fit, and evaluate a measurement model using CFA in lavaan

Key concepts: Factor loadings, model fit indices, reliability, convergent and discriminant validity

Why CFA before SEM: The measurement model must be established before structural paths are meaningful. If your indicators do not adequately reflect the intended latent constructs, the structural estimates will be uninterpretable.

What is Confirmatory Factor Analysis?

Confirmatory Factor Analysis (CFA) is a measurement modelling technique in which the researcher specifies in advance which observed variables (indicators) are assumed to reflect which latent factors (constructs), and then tests whether this specification is consistent with the observed data. This is what distinguishes CFA from Exploratory Factor Analysis (EFA): in EFA the factor structure is discovered from the data with no prior constraints; in CFA the factor structure is specified from theory and then confirmed (or disconfirmed) empirically.

In our example, we hypothesise three latent factors:

  • Anxiety (ANX), indicated by anx1, anx2, anx3
  • Self-Efficacy (EFF), indicated by eff1, eff2, eff3
  • Motivation (MOT), indicated by mot1, mot2, mot3

Specifying a CFA Model in lavaan

The lavaan package uses a simple, readable model syntax. The key operator for defining a measurement model is =~, which is read as “is measured by” or “is indicated by”:

LatentVariable =~ indicator1 + indicator2 + indicator3

We specify our three-factor measurement model as follows:

Code
cfa_model <- '
  # Measurement model
  ANX =~ anx1 + anx2 + anx3
  EFF =~ eff1 + eff2 + eff3
  MOT =~ mot1 + mot2 + mot3
'
lavaan Model Syntax at a Glance
Operator Meaning Example
=~ Measured by (latent → indicator) ANX =~ anx1 + anx2
~ Regressed on (structural path) MOT ~ EFF
~~ Correlated with (covariance) ANX ~~ EFF
~1 Intercept / mean anx1 ~ 1

By default, lavaan fixes the first indicator loading to 1.0 to set the scale of each latent variable (the marker variable method), freely estimates the remaining loadings, freely estimates all indicator residuals, and freely estimates all latent variable covariances. You can change these defaults using arguments to cfa() or sem().
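For example, a common alternative to the marker variable method is to standardise the latent variables instead, which frees all nine loadings. A minimal sketch:

```r
# Identification via standardised latents: fix each latent variance
# to 1 (std.lv = TRUE) instead of fixing the first loading to 1
cfa_fit_std <- lavaan::cfa(cfa_model,
                           data   = semdata,
                           std.lv = TRUE)
# The two parameterisations are statistically equivalent: overall
# model fit is identical, only the scaling of the estimates differs
```

Which identification method you choose is a matter of convenience; the marker variable default is used throughout this tutorial.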

Fitting the CFA Model

We fit the model using lavaan::cfa(). The default estimator is Maximum Likelihood (ML), which assumes multivariate normality of the observed variables.

Code
cfa_fit <- lavaan::cfa(cfa_model,
                       data      = semdata,
                       estimator = "ML")

summary(cfa_fit, fit.measures = TRUE, standardized = TRUE)
lavaan 0.6-21 ended normally after 29 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        21

  Number of observations                           300

Model Test User Model:
                                                      
  Test statistic                                12.806
  Degrees of freedom                                24
  P-value (Chi-square)                           0.969

Model Test Baseline Model:

  Test statistic                               856.110
  Degrees of freedom                                36
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.020

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -3379.871
  Loglikelihood unrestricted model (H1)      -3373.468
                                                      
  Akaike (AIC)                                6801.742
  Bayesian (BIC)                              6879.521
  Sample-size adjusted Bayesian (SABIC)       6812.922

Root Mean Square Error of Approximation:

  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value H_0: RMSEA <= 0.050                    1.000
  P-value H_0: RMSEA >= 0.080                    0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.023

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ANX =~                                                                
    anx1              1.000                               0.770    0.754
    anx2              0.937    0.090   10.447    0.000    0.721    0.742
    anx3              0.965    0.092   10.479    0.000    0.743    0.757
  EFF =~                                                                
    eff1              1.000                               0.833    0.824
    eff2              0.923    0.082   11.253    0.000    0.768    0.733
    eff3              0.825    0.075   10.992    0.000    0.687    0.706
  MOT =~                                                                
    mot1              1.000                               0.664    0.677
    mot2              1.134    0.116    9.768    0.000    0.753    0.782
    mot3              1.031    0.108    9.571    0.000    0.685    0.716

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ANX ~~                                                                
    EFF              -0.004    0.046   -0.095    0.924   -0.007   -0.007
    MOT               0.003    0.038    0.093    0.926    0.007    0.007
  EFF ~~                                                                
    MOT               0.291    0.049    5.901    0.000    0.526    0.526

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .anx1              0.449    0.059    7.658    0.000    0.449    0.431
   .anx2              0.424    0.053    7.990    0.000    0.424    0.449
   .anx3              0.412    0.054    7.583    0.000    0.412    0.427
   .eff1              0.328    0.054    6.081    0.000    0.328    0.321
   .eff2              0.508    0.058    8.695    0.000    0.508    0.463
   .eff3              0.475    0.051    9.277    0.000    0.475    0.502
   .mot1              0.521    0.056    9.266    0.000    0.521    0.542
   .mot2              0.359    0.054    6.701    0.000    0.359    0.388
   .mot3              0.447    0.053    8.464    0.000    0.447    0.488
    ANX               0.592    0.089    6.629    0.000    1.000    1.000
    EFF               0.693    0.092    7.551    0.000    1.000    1.000
    MOT               0.441    0.076    5.835    0.000    1.000    1.000

This output contains three major sections: model fit information, factor loadings (both unstandardised and standardised), and latent variable covariances. We work through each in the sections below.

Interpreting Factor Loadings

Factor loadings express how strongly each indicator is related to its underlying latent variable. In the standardised solution (column Std.all), a loading can be interpreted like a correlation: it represents the expected change in the standardised indicator for a one-standard-deviation increase in the latent variable. Standardised loadings above 0.50 are generally considered acceptable; loadings above 0.70 are considered strong (Hair et al. 2019).

We extract and display the standardised loadings:

Code
loadings_df <- lavaan::standardizedsolution(cfa_fit) %>%
  dplyr::filter(op == "=~") %>%
  dplyr::select(Latent = lhs, Indicator = rhs,
                Std_Loading = est.std, SE = se,
                z = z, p = pvalue) %>%
  dplyr::mutate(across(where(is.numeric), ~round(.x, 3)))

loadings_df %>%
  flextable() %>%
  flextable::set_table_properties(width = .85, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Standardised CFA factor loadings with standard errors and significance tests.") %>%
  flextable::border_outer()

Latent   Indicator   Std_Loading      SE        z       p
ANX      anx1              0.754   0.038   19.668   0.000
ANX      anx2              0.742   0.039   19.193   0.000
ANX      anx3              0.757   0.038   19.772   0.000
EFF      eff1              0.824   0.033   24.642   0.000
EFF      eff2              0.733   0.037   19.838   0.000
EFF      eff3              0.706   0.038   18.458   0.000
MOT      mot1              0.677   0.042   16.071   0.000
MOT      mot2              0.782   0.038   20.464   0.000
MOT      mot3              0.716   0.041   17.666   0.000

All standardised loadings exceed 0.50 (range: 0.677–0.824) and are statistically significant, confirming that each indicator is a meaningful reflection of its intended latent construct.

Model Fit Assessment

Fitting a CFA does not automatically produce a good model. We must evaluate how well the specified model reproduces the observed covariance structure in the data. This is done using model fit indices — statistics that summarise the discrepancy between the model-implied covariance matrix and the observed covariance matrix.

Model Fit Indices: What They Mean and What Cut-offs to Use

No single fit index is sufficient. Report a combination of the following:

Index Full name What it measures Acceptable Good
χ² Chi-square test Overall model misfit (sensitive to N) p > .05 (rarely achieved)
CFI Comparative Fit Index Fit relative to null model ≥ .90 ≥ .95
TLI Tucker–Lewis Index Fit relative to null model (penalises complexity) ≥ .90 ≥ .95
RMSEA Root Mean Square Error of Approximation Average misfit per degree of freedom ≤ .08 ≤ .05
SRMR Standardised Root Mean Square Residual Average standardised residual ≤ .08 ≤ .05

Cut-offs are from Hu and Bentler (1999). Note that these are guidelines, not hard thresholds. Model fit must always be evaluated in the context of model complexity and sample size.

The χ² test is almost always significant in moderate to large samples even for well-fitting models, because it is extremely sensitive to sample size. It is therefore standard practice to rely on the incremental and approximate fit indices (CFI, TLI, RMSEA, SRMR) rather than on χ² alone.

We extract the key fit indices:

Code
fit_indices <- lavaan::fitMeasures(cfa_fit,
                                   c("chisq", "df", "pvalue",
                                     "cfi", "tli",
                                     "rmsea", "rmsea.ci.lower", "rmsea.ci.upper",
                                     "srmr")) %>%
  round(3)

fit_df <- data.frame(
  Index = c("χ²", "df", "p (χ²)", "CFI", "TLI",
            "RMSEA", "RMSEA 90% CI lower", "RMSEA 90% CI upper", "SRMR"),
  Value = as.numeric(fit_indices),
  Threshold = c("—", "—", "> .05", "≥ .95", "≥ .95",
                "≤ .05", "—", "—", "≤ .05")
)

fit_df %>%
  flextable() %>%
  flextable::set_table_properties(width = .70, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "CFA model fit indices with recommended thresholds.") %>%
  flextable::border_outer()

Index                  Value   Threshold
χ²                    12.806   —
df                    24.000   —
p (χ²)                 0.969   > .05
CFI                    1.000   ≥ .95
TLI                    1.020   ≥ .95
RMSEA                  0.000   ≤ .05
RMSEA 90% CI lower     0.000   —
RMSEA 90% CI upper     0.000   —
SRMR                   0.023   ≤ .05

Internal Consistency Reliability

Beyond model fit, we assess whether each scale is internally consistent — that is, whether the indicators of each latent variable reliably hang together. We use McDonald’s omega (ω), which is the preferred reliability coefficient for factor-based scales because, unlike Cronbach’s alpha, it does not assume equal factor loadings (McDonald 1999).

Code
rel <- semTools::reliability(cfa_fit)  # note: newer semTools versions also offer compRelSEM()

rel_df <- data.frame(
  Scale      = c("ANX (Language Anxiety)",
                 "EFF (Writing Self-Efficacy)",
                 "MOT (Motivation)"),
  Omega      = round(as.numeric(rel["omega", ]), 3),
  Alpha      = round(as.numeric(rel["alpha", ]), 3)
)

rel_df %>%
  flextable() %>%
  flextable::set_table_properties(width = .70, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "McDonald's omega (ω) and Cronbach's alpha (α) for each scale.") %>%
  flextable::border_outer()

Scale                          Omega   Alpha
ANX (Language Anxiety)         0.795   0.795
EFF (Writing Self-Efficacy)    0.800   0.798
MOT (Motivation)               0.769   0.769

Values of ω ≥ .70 are generally considered acceptable for research purposes; ω ≥ .80 is considered good (Nunnally 1978).
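The reliability() output computed above also contains the average variance extracted (AVE) in its avevar row, a common check of convergent validity: AVE ≥ .50 indicates that, on average, a construct explains more of its indicators’ variance than measurement error does (Fornell and Larcker 1981). A sketch, reusing the rel object from the code above:

```r
# Average variance extracted per factor, from the semTools::reliability()
# output already stored in rel
round(rel["avevar", ], 3)
# Compare each value against the .50 benchmark for convergent validity
```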

Visualising the Measurement Model

The semPlot package produces path diagrams directly from a fitted lavaan object.

Code
semPlot::semPaths(
  cfa_fit,
  what       = "std",          # show standardised estimates
  layout     = "tree",
  rotation   = 2,
  edge.label.cex = 0.85,
  sizeMan    = 7,
  sizeLat    = 10,
  color      = list(lat = "steelblue", man = "lightyellow"),
  title      = FALSE,
  style      = "lisrel"
)
title("CFA measurement model — standardised solution", cex.main = 1)

Each oval represents a latent variable; each rectangle an observed indicator. The numbers on the arrows are standardised factor loadings; the numbers on the small arrows into each rectangle are standardised residual variances (unique errors).

Knowledge Check: CFA

Q1. In a CFA path diagram, what does a single-headed arrow from an oval to a rectangle represent?






Q2. A CFA model returns CFI = .88 and RMSEA = .09. What is the most appropriate conclusion?






Q3. What is the main difference between CFA and Exploratory Factor Analysis (EFA)?






Full Structural Equation Model

Section Overview

What you’ll learn: How to extend a CFA measurement model by adding directional structural paths between latent variables and outcomes

Key concepts: Endogenous vs. exogenous variables, structural paths, disturbances, standardised path coefficients

Once we are satisfied with the measurement model, we add the structural paths — the directional hypotheses about how the latent variables relate to each other and to the writing score outcome. Our theoretical model predicts:

  1. Anxiety → Writing Score (negative effect: more anxious students perform worse)
  2. Self-Efficacy → Writing Score (positive effect)
  3. Self-Efficacy → Motivation (positive effect: more efficacious students are more motivated)
  4. Motivation → Writing Score (positive effect)

Path (3) combined with path (4) constitutes an indirect effect of Self-Efficacy on Writing Score through Motivation — a mediation hypothesis examined in Section 6.
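Section 6 tests this indirect effect formally. As a preview of the syntax, lavaan allows paths to be labelled and derived quantities to be defined with the := operator (the labels a, b, c and the names ind and tot below are our own choices, not lavaan keywords):

```r
med_model <- '
  # Measurement model (as before)
  ANX =~ anx1 + anx2 + anx3
  EFF =~ eff1 + eff2 + eff3
  MOT =~ mot1 + mot2 + mot3

  # Labelled structural paths
  MOT           ~ a * EFF                  # path (3): EFF -> MOT
  writing_score ~ c * EFF + b * MOT + ANX  # direct effects on the outcome

  # Derived effects
  ind := a * b        # indirect effect of EFF via MOT
  tot := c + a * b    # total effect of EFF on writing_score
'
```

lavaan computes standard errors for ind and tot via the delta method; bootstrapped confidence intervals, the preferred approach for indirect effects, are covered in the mediation section.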

Specifying the Full SEM

In lavaan, structural paths are specified using the ~ operator, which is read as “is regressed on”:

Outcome ~ Predictor

We combine the measurement model (from the CFA) with the structural paths in a single model string:

Code
sem_model <- '
  # --- Measurement model ---
  ANX =~ anx1 + anx2 + anx3
  EFF =~ eff1 + eff2 + eff3
  MOT =~ mot1 + mot2 + mot3

  # --- Structural paths ---
  MOT           ~ EFF
  writing_score ~ ANX + EFF + MOT
'
Endogenous vs. Exogenous Variables

In SEM terminology:

  • Exogenous variables have no incoming arrows (they are only predictors, never outcomes). In our model, ANX and EFF are exogenous latent variables.
  • Endogenous variables have at least one incoming arrow (they are outcomes of at least one other variable). MOT and writing_score are endogenous.

Endogenous variables have a disturbance (residual error) term — the part of their variance not explained by the variables pointing to them. lavaan estimates disturbances automatically.

Fitting the Full SEM

We fit the full SEM using lavaan::sem(). The syntax is identical to cfa() but with the full model specification:

Code
sem_fit <- lavaan::sem(sem_model,
                       data      = semdata,
                       estimator = "ML")

summary(sem_fit, fit.measures = TRUE, standardized = TRUE)
lavaan 0.6-21 ended normally after 56 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        24

  Number of observations                           300

Model Test User Model:
                                                      
  Test statistic                                16.790
  Degrees of freedom                                31
  P-value (Chi-square)                           0.982

Model Test Baseline Model:

  Test statistic                              1250.761
  Degrees of freedom                                45
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.017

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -4417.659
  Loglikelihood unrestricted model (H1)      -4409.264
                                                      
  Akaike (AIC)                                8883.317
  Bayesian (BIC)                              8972.208
  Sample-size adjusted Bayesian (SABIC)       8896.094

Root Mean Square Error of Approximation:

  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value H_0: RMSEA <= 0.050                    1.000
  P-value H_0: RMSEA >= 0.080                    0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.023

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ANX =~                                                                
    anx1              1.000                               0.765    0.749
    anx2              0.942    0.084   11.160    0.000    0.721    0.742
    anx3              0.978    0.086   11.339    0.000    0.748    0.762
  EFF =~                                                                
    eff1              1.000                               0.809    0.801
    eff2              0.973    0.072   13.596    0.000    0.787    0.751
    eff3              0.859    0.067   12.790    0.000    0.695    0.714
  MOT =~                                                                
    mot1              1.000                               0.672    0.685
    mot2              1.087    0.106   10.269    0.000    0.730    0.759
    mot3              1.044    0.103   10.096    0.000    0.702    0.733

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  MOT ~                                                                 
    EFF               0.435    0.065    6.695    0.000    0.524    0.524
  writing_score ~                                                       
    ANX              -7.257    0.795   -9.126    0.000   -5.550   -0.376
    EFF              12.811    1.001   12.801    0.000   10.365    0.702
    MOT               5.220    1.060    4.926    0.000    3.507    0.238

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  ANX ~~                                                                
    EFF              -0.006    0.044   -0.145    0.885   -0.010   -0.010

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .anx1              0.457    0.054    8.528    0.000    0.457    0.439
   .anx2              0.424    0.049    8.699    0.000    0.424    0.450
   .anx3              0.404    0.049    8.228    0.000    0.404    0.420
   .eff1              0.367    0.040    9.051    0.000    0.367    0.359
   .eff2              0.478    0.048   10.006    0.000    0.478    0.435
   .eff3              0.464    0.044   10.480    0.000    0.464    0.490
   .mot1              0.511    0.054    9.499    0.000    0.511    0.531
   .mot2              0.393    0.049    7.982    0.000    0.393    0.424
   .mot3              0.423    0.049    8.586    0.000    0.423    0.462
   .writing_score    28.004    5.613    4.989    0.000   28.004    0.128
    ANX               0.585    0.086    6.836    0.000    1.000    1.000
    EFF               0.655    0.082    7.938    0.000    1.000    1.000
   .MOT               0.328    0.058    5.651    0.000    0.726    0.726

Structural Path Estimates

We extract the structural paths for a clean summary table:

Code
sem_paths_df <- lavaan::standardizedsolution(sem_fit) %>%
  dplyr::filter(op == "~") %>%
  dplyr::select(Outcome = lhs, Predictor = rhs,
                Std_Estimate = est.std, SE = se,
                z = z, p = pvalue) %>%
  dplyr::mutate(
    across(where(is.numeric), ~round(.x, 3)),
    Sig = dplyr::case_when(
      p < .001 ~ "***",
      p < .01  ~ "**",
      p < .05  ~ "*",
      TRUE     ~ ""
    )
  )

sem_paths_df %>%
  flextable() %>%
  flextable::set_table_properties(width = .90, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Standardised structural path coefficients from the full SEM.") %>%
  flextable::border_outer()

Outcome        Predictor  Std_Estimate  SE     z        p  Sig
MOT            EFF         0.524        0.058    8.989  0  ***
writing_score  ANX        -0.376        0.038  -10.003  0  ***
writing_score  EFF         0.702        0.041   17.174  0  ***
writing_score  MOT         0.238        0.045    5.234  0  ***

Standardised path coefficients can be interpreted similarly to standardised regression coefficients (β): they indicate the expected change in the outcome (in standard deviation units) for a one-standard-deviation increase in the predictor, holding all other predictors constant.
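The link between the unstandardised and standardised estimates can be verified by hand: β = b × SD(predictor) / SD(outcome). A sketch for the ANX → writing_score path, where the SD of ANX is the square root of its latent variance (0.585) and the implied SD of writing_score is recovered here from its residual variance (28.004) and standardised residual (.128) — an assumption-light back-calculation, so expect small rounding discrepancies:

```r
# beta = b * SD(predictor) / SD(outcome), values from the summary output above
b_anx      <- -7.257                # unstandardised ANX -> writing_score
sd_anx     <- sqrt(0.585)           # SD of ANX (latent variance 0.585)
sd_writing <- sqrt(28.004 / 0.128)  # implied total SD of writing_score
beta_anx   <- b_anx * sd_anx / sd_writing
round(beta_anx, 3)                  # close to the reported Std.all of -0.376
```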

Visualising the Full SEM

Code
semPlot::semPaths(
  sem_fit,
  what       = "std",
  layout     = "tree2",
  rotation   = 2,
  edge.label.cex = 0.80,
  sizeMan    = 6,
  sizeLat    = 10,
  color      = list(lat = "steelblue", man = "lightyellow"),
  title      = FALSE,
  style      = "lisrel",
  residuals  = TRUE,
  curvePivot = TRUE
)
title("Full SEM — standardised solution", cex.main = 1)

R² for Endogenous Variables

We can also inspect how much variance in each endogenous variable is explained by its predictors (the SEM equivalent of R²):

Code
r2_df <- lavaan::inspect(sem_fit, "r2") %>%
  as.data.frame() %>%
  tibble::rownames_to_column("Variable") %>%
  dplyr::rename(R2 = ".") %>%
  dplyr::mutate(R2 = round(R2, 3)) %>%
  dplyr::filter(R2 > 0)

r2_df %>%
  flextable() %>%
  flextable::set_table_properties(width = .45, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "center") %>%
  flextable::set_caption(caption = "Proportion of variance explained (R²) for endogenous variables.") %>%
  flextable::border_outer()

Variable       R2
anx1           0.561
anx2           0.550
anx3           0.580
eff1           0.641
eff2           0.565
eff3           0.510
mot1           0.469
mot2           0.576
mot3           0.538
writing_score  0.872
MOT            0.274
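For reflective indicators, these R² values are simply the squared standardised loadings from the measurement model (the Std.all column of the output above), which makes them easy to check by hand:

```r
# Indicator R-squared = squared standardised loading (Std.all from the output)
round(0.749^2, 3)  # anx1: matches the 0.561 in the table
round(0.685^2, 3)  # mot1: matches the 0.469 in the table
```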

Knowledge Check: Full SEM

Q1. In the lavaan model syntax, what does the ~ operator specify?






Q2. A standardised structural path coefficient of β = −0.42 (p < .001) from Anxiety to Writing Score means:






Mediation Analysis

Section Overview

What you’ll learn: How to test mediation hypotheses — indirect effects of one variable on another via a third — within an SEM framework

Key concepts: Direct effects, indirect effects, total effects, bootstrapped confidence intervals

What is Mediation?

Mediation occurs when the effect of a predictor (X) on an outcome (Y) operates — at least in part — through an intervening variable, the mediator (M). Rather than a simple direct path X → Y, the effect is transmitted via the chain X → M → Y.

In our example, the theoretical mediation hypothesis is:

Self-Efficacy (EFF) influences Writing Score both directly and indirectly by increasing Motivation (MOT), which in turn improves Writing Score.

This decomposes the total effect of Self-Efficacy on Writing Score into:

  • A direct effect: EFF → writing_score
  • An indirect effect (via Motivation): EFF → MOT → writing_score
  • The total effect = direct + indirect
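This decomposition is plain arithmetic on the path estimates. A sketch using the unstandardised estimates reported later in this section (a = 0.435, b = 5.220, c = 12.811); the products differ from the printed table in the third decimal only because lavaan computes them from unrounded estimates:

```r
# Effect decomposition: indirect = a * b, total = c + a * b
a_path <- 0.435    # EFF -> MOT
b_path <- 5.220    # MOT -> writing_score
c_path <- 12.811   # direct EFF -> writing_score
indirect <- a_path * b_path     # about 2.271 (printed as 2.270)
total    <- c_path + indirect   # about 15.082 (printed as 15.080)
round(c(indirect = indirect, total = total), 3)
```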

Specifying Mediation in lavaan

lavaan uses labels to name individual paths, which can then be combined using the := operator to define new parameters such as indirect and total effects. Labels are assigned by prefixing a path coefficient with a name followed by *:

Code
mediation_model <- '
  # --- Measurement model ---
  ANX =~ anx1 + anx2 + anx3
  EFF =~ eff1 + eff2 + eff3
  MOT =~ mot1 + mot2 + mot3

  # --- Structural paths (labelled for mediation) ---
  MOT           ~ a * EFF                    # path a: EFF -> MOT
  writing_score ~ b * MOT                    # path b: MOT -> writing_score
  writing_score ~ c * EFF + ANX              # path c: direct effect EFF -> writing_score

  # --- Defined parameters ---
  indirect := a * b                          # indirect effect of EFF via MOT
  total    := c + (a * b)                    # total effect of EFF on writing_score
'

Bootstrapped Confidence Intervals for Indirect Effects

Indirect effects are the product of two path coefficients (a × b). Their sampling distribution is often asymmetric and non-normal, which makes standard errors based on normality assumptions unreliable. The recommended approach is bootstrapping: repeatedly resampling from the data, re-fitting the model, and using the resulting distribution of indirect effect estimates to construct confidence intervals. If the 95% bootstrapped CI does not contain zero, the indirect effect is considered statistically significant.

Code
set.seed(42)
med_fit <- lavaan::sem(mediation_model,
                       data      = semdata,
                       estimator = "ML",
                       se        = "bootstrap",
                       bootstrap = 1000)

# Extract direct, indirect, and total effects
med_effects <- lavaan::parameterEstimates(med_fit, boot.ci.type = "bca.simple") %>%
  dplyr::filter(label %in% c("a", "b", "c", "indirect", "total")) %>%
  dplyr::select(Label = label, Estimate = est, SE = se,
                CI_lower = ci.lower, CI_upper = ci.upper, p = pvalue) %>%
  dplyr::mutate(across(where(is.numeric), ~round(.x, 3)))

med_effects %>%
  flextable() %>%
  flextable::set_table_properties(width = .85, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Direct, indirect (mediated), and total effects with bootstrapped 95% CIs (1000 resamples).") %>%
  flextable::border_outer()

Label     Estimate  SE     CI_lower  CI_upper  p
a          0.435    0.061   0.330     0.573    0
b          5.220    1.136   2.970     7.599    0
c         12.811    1.170  10.722    15.377    0
indirect   2.270    0.516   1.386     3.423    0
total     15.080    1.072  13.180    17.404    0

Interpreting Mediation Results

To interpret mediation, we examine the following:

  1. Path a (EFF → MOT): Is Self-Efficacy a significant predictor of Motivation?
  2. Path b (MOT → writing_score): Is Motivation a significant predictor of Writing Score (controlling for the other predictors)?
  3. Indirect effect (a × b): Is the product significant, as indicated by a 95% CI that excludes zero?
  4. Direct effect c (EFF → writing_score): Does Self-Efficacy still predict Writing Score after accounting for the mediation?

If both the indirect effect is significant and the direct effect remains significant, we have partial mediation: Motivation carries part of the effect of Self-Efficacy to Writing Score, but Self-Efficacy also has an effect above and beyond that mediated path. If the direct effect becomes non-significant while the indirect effect is significant, we have full mediation.
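One common, if rough, summary of mediation strength is the proportion of the total effect carried by the indirect path. This is an additional descriptive quantity, not something lavaan reports by default, and it should be used cautiously — it becomes unstable when the total effect is small or when direct and indirect effects have opposite signs:

```r
# Proportion mediated = indirect / total, using the estimates in the table above
indirect <- 2.270
total    <- 15.080
round(indirect / total, 2)   # about 0.15: roughly 15% of EFF's effect runs via MOT
```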

A Note on Causal Language

Mediation analysis is often discussed in causal terms (“X causes Y through M”). However, causal inference from cross-sectional observational data is not straightforward. A statistically significant indirect effect demonstrates that the data are consistent with a mediation mechanism — it does not prove causation. To make stronger causal claims, researchers need longitudinal designs, experimental manipulation of the mediator, or other causal identification strategies.

Knowledge Check: Mediation

Q1. What is the indirect effect in a mediation model?






Q2. Why are bootstrapped confidence intervals preferred over standard (normal-theory) confidence intervals for indirect effects?






Model Comparison and Modification

Section Overview

What you’ll learn: How to compare alternative SEM specifications using formal tests and fit indices, and how to use modification indices responsibly

Key concepts: Nested models, likelihood ratio (chi-square difference) test, AIC/BIC, modification indices

Why Compare Models?

In practice, researchers often have competing theoretical models — alternative specifications that make different predictions about which paths should be present or absent. SEM provides tools for formally comparing such models. There are two situations:

  1. Nested models: Model A is a special case of Model B (Model A is Model B with one or more paths fixed to zero). These can be compared with a chi-square difference test (Δχ²).
  2. Non-nested models: Neither model is a special case of the other. These are compared using information criteria (AIC, BIC): lower values indicate better fit, penalised for model complexity.
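The chi-square difference test can be computed by hand from the two model test statistics, which is a useful check on the logic. A sketch using the values that lavTestLRT() reports for the two models in this section:

```r
# Delta chi-square = difference in test statistics, on the difference in df
chisq_full        <- 16.790    # full model, df = 31
chisq_constrained <- 138.618   # constrained model, df = 32
delta_chisq <- chisq_constrained - chisq_full   # 121.828, matching lavTestLRT()
delta_df    <- 32 - 31
pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)  # effectively zero
```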

Comparing a Constrained Model

Suppose a reviewer argues that the direct path from Self-Efficacy to Writing Score is unnecessary and that all of Self-Efficacy’s influence on Writing Score is mediated through Motivation. We test this by fitting a constrained model with the direct EFF → writing_score path removed, and comparing it to the full model.

Code
constrained_model <- '
  # --- Measurement model ---
  ANX =~ anx1 + anx2 + anx3
  EFF =~ eff1 + eff2 + eff3
  MOT =~ mot1 + mot2 + mot3

  # --- Structural paths (direct EFF -> writing_score path removed) ---
  MOT           ~ EFF
  writing_score ~ ANX + MOT
'

constrained_fit <- lavaan::sem(constrained_model,
                               data      = semdata,
                               estimator = "ML")

# Chi-square difference test
lavaan::lavTestLRT(constrained_fit, sem_fit)

Chi-Squared Difference Test

                Df         AIC         BIC     Chisq  Chisq diff        RMSEA
sem_fit         31 8883.317010 8972.207790  16.78985                         
constrained_fit 32 9003.145145 9088.332142 138.61799 121.8281353 0.6346341079
                Df diff             Pr(>Chisq)    
sem_fit                                           
constrained_fit       1 < 0.000000000000000222 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A significant Δχ² (p < .05) indicates that the constrained model fits significantly worse — that is, removing the direct path causes a significant deterioration in fit, providing evidence that the direct path contributes meaningfully and should be retained.

We also compare AIC and BIC:

Code
aic_bic <- data.frame(
  Model = c("Full model (with direct EFF path)",
            "Constrained model (no direct EFF path)"),
  AIC   = round(c(AIC(sem_fit), AIC(constrained_fit)), 1),
  BIC   = round(c(BIC(sem_fit), BIC(constrained_fit)), 1)
)

aic_bic %>%
  flextable() %>%
  flextable::set_table_properties(width = .80, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Model comparison: AIC and BIC for the full and constrained models.") %>%
  flextable::border_outer()

Model                                    AIC      BIC
Full model (with direct EFF path)        8,883.3  8,972.2
Constrained model (no direct EFF path)   9,003.1  9,088.3

The preferred model has the lower AIC (and lower BIC, which applies a stronger penalty for model complexity). A difference of more than 10 in BIC is generally considered strong evidence in favour of the model with the lower value.
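Applying that rule of thumb to the two models above is a one-line calculation:

```r
# Delta-BIC between the constrained and full models (values from the table above)
bic_full        <- 8972.2
bic_constrained <- 9088.3
bic_constrained - bic_full   # about 116, far above 10: strong evidence for the full model
```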

Modification Indices

If a model fits poorly, modification indices (MIs) can help diagnose which additional paths or covariances would most improve fit. Each MI indicates how much the overall model χ² would decrease if a particular currently-fixed parameter were freed (estimated).

Code
mi <- lavaan::modindices(sem_fit, sort. = TRUE, maximum.number = 10)

mi %>%
  dplyr::select(lhs, op, rhs, mi, epc) %>%
  dplyr::mutate(across(c(mi, epc), ~round(.x, 3))) %>%
  dplyr::rename(LHS = lhs, Operator = op, RHS = rhs,
                MI = mi, `Expected Parameter Change` = epc) %>%
  flextable() %>%
  flextable::set_table_properties(width = .85, layout = "autofit") %>%
  flextable::theme_zebra() %>%
  flextable::fontsize(size = 11) %>%
  flextable::fontsize(size = 11, part = "header") %>%
  flextable::align_text_col(align = "left") %>%
  flextable::set_caption(caption = "Top 10 modification indices (sorted by MI, descending). MI > 10 typically warrants attention.") %>%
  flextable::border_outer()

LHS   Operator  RHS            MI     Expected Parameter Change
anx3  ~~        mot3           5.008   0.071
ANX   =~        eff2           2.950  -0.119
mot2  ~~        writing_score  2.363  -0.616
ANX   =~        eff1           2.327   0.101
eff3  ~~        mot3           2.272  -0.047
anx3  ~~        mot2           1.681  -0.041
MOT   =~        eff1           1.483   0.125
mot3  ~~        writing_score  1.231   0.435
anx2  ~~        eff2           1.213  -0.036
eff1  ~~        writing_score  1.028  -0.538
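Reading the table: the largest MI implies that freeing the anx3 ~~ mot3 residual covariance would reduce the model χ² by roughly the MI value. A quick sketch of what that would mean here:

```r
# Expected chi-square after freeing the top-MI parameter (approximate)
chisq_current <- 16.790   # model chi-square from the SEM output above
mi_top        <- 5.008    # MI for anx3 ~~ mot3
chisq_current - mi_top    # about 11.78 on 30 df -- the model already fits well,
                          # and no MI exceeds 10, so no modification is warranted
```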

Using Modification Indices Responsibly

Modification indices are a double-edged sword. They are useful for diagnosing systematic misfit (e.g., correlated residuals between items that share method variance). However, acting on every high MI and re-fitting the model is a form of capitalising on chance: the revised model will fit the current sample better but may not generalise.

Rules of thumb for responsible use of MIs:

  1. Theory first: only free a parameter if there is a substantive, theoretically defensible reason to do so.
  2. One at a time: modify one parameter, re-fit, re-inspect — do not free multiple parameters simultaneously.
  3. Cross-validate: if sample size permits, split the data and use one half to explore modifications and the other to confirm them.
  4. Report transparently: if modifications were made post-hoc, report this explicitly and distinguish the revised model from the originally hypothesised model.
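If a modification were theoretically defensible, the one-at-a-time workflow could be sketched as follows. This is purely illustrative — in the current example no MI exceeds 10, so none of this is actually warranted — and it assumes the sem_model string, sem_fit object, and semdata from earlier in this tutorial:

```r
# Hypothetical one-at-a-time modification: free only the single highest-MI
# parameter (here the anx3 ~~ mot3 residual covariance), refit, and compare.
modified_model <- paste(sem_model, '
  anx3 ~~ mot3   # freed residual covariance -- justify on substantive grounds first!
')
modified_fit <- lavaan::sem(modified_model, data = semdata, estimator = "ML")
lavaan::lavTestLRT(sem_fit, modified_fit)   # did fit improve significantly?
```

Only after inspecting this comparison (and, ideally, cross-validating) would one consider a further modification.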

Knowledge Check: Model Comparison

Q1. What does a significant chi-square difference test (Δχ²) between two nested models indicate?






Q2. A modification index of 24.5 suggests adding a cross-loading of anx2 onto the EFF factor. Should you add this path?






Reporting Standards

Reporting SEM results clearly and completely is as important as the analysis itself. This section summarises conventions for reporting SEM in linguistics and applied linguistics research.


General Principles

What to Report in an SEM Study

Following current best practice (Kline 2023; Jackson, Gillaspy, and Purc-Stephenson 2009):

Model specification

  • The full theoretical rationale for the hypothesised model
  • Which variables are latent vs. observed; which indicators load onto which factors
  • Software and estimator used (e.g., “Models were estimated in R using the lavaan package [version X] with Maximum Likelihood estimation”)

Measurement model (CFA)

  • Standardised factor loadings for all indicators (with SEs and significance)
  • All model fit indices: χ²(df), CFI, TLI, RMSEA (with 90% CI), SRMR
  • Scale reliabilities (McDonald’s ω or Cronbach’s α)

Structural model

  • Standardised path coefficients (with SEs and significance)
  • R² for all endogenous variables
  • Model fit indices

Mediation (if applicable)

  • Labelled paths (a, b, c/c’), indirect effect, total effect
  • Bootstrapped confidence intervals (state N of resamples)
  • Whether partial or full mediation was found

Model comparisons (if applicable)

  • Δχ², Δdf, p-value for nested comparisons
  • AIC/BIC for non-nested comparisons

Model Reporting Paragraphs

CFA

A three-factor measurement model was specified a priori based on the theoretical framework, with Language Anxiety (ANX), Writing Self-Efficacy (EFF), and Motivation (MOT) each indicated by three Likert-scale items (nine indicators in total). The model was estimated using Maximum Likelihood in R (lavaan). Model fit was excellent: χ²(df) = X.XX, CFI = .97, TLI = .96, RMSEA = .04 [90% CI: .01, .07], SRMR = .04. All standardised factor loadings were significant and exceeded 0.70 (range: .71–.82), and McDonald’s ω exceeded .80 for all three scales, indicating good reliability. The measurement model was retained for subsequent structural analysis.

Full SEM

The structural model specified directional effects of Language Anxiety and Writing Self-Efficacy on Writing Score, and an effect of Self-Efficacy on Motivation. Model fit was acceptable: χ²(df) = X.XX, CFI = .96, TLI = .95, RMSEA = .05 [90% CI: .02, .07], SRMR = .05. Writing Self-Efficacy was a significant positive predictor of both Motivation (β = .55, SE = .07, p < .001) and Writing Score (β = .47, SE = .08, p < .001). Language Anxiety was a significant negative predictor of Writing Score (β = −.38, SE = .07, p < .001). Motivation significantly predicted Writing Score (β = .21, SE = .07, p = .003). Together, the predictors explained 58% of the variance in Writing Score.

Mediation

To test whether the effect of Writing Self-Efficacy on Writing Score was partially mediated by Motivation, we re-estimated the model with labelled paths and requested 1000 bootstrap resamples for inference on the indirect effect. The indirect effect of Self-Efficacy on Writing Score via Motivation was significant (unstandardised b = X.XX, 95% BCa CI [X.XX, X.XX]), indicating that part of the positive effect of self-efficacy on writing performance operates through increased motivation. The direct effect of Self-Efficacy on Writing Score remained significant after accounting for this indirect path, supporting partial mediation.


Quick Reference: SEM Workflow

  1. Theoretical specification — draw the path diagram; specify which indicators load onto which factors and which structural paths are hypothesised.
  2. Descriptive checks — examine distributions (skewness, kurtosis) and correlations; check for multivariate outliers. [psych::describe(), cor()]
  3. Confirmatory Factor Analysis — fit the measurement model. [lavaan::cfa()]
  4. Evaluate measurement fit — inspect CFI, TLI, RMSEA, and SRMR against recommended thresholds. [lavaan::fitMeasures()]
  5. Assess reliability — compute McDonald's ω. [semTools::reliability()]
  6. Full SEM — add structural paths and refit. [lavaan::sem()]
  7. Mediation (if applicable) — label paths; define indirect/total effects with ':='; use se = "bootstrap". [lavaan::sem(se = "bootstrap")]
  8. Model comparison — use lavTestLRT() for nested models and AIC/BIC for non-nested models; consult MIs together with theory. [lavaan::lavTestLRT(), AIC(), modindices()]
  9. Report — all fit indices, standardised loadings, path coefficients, R², and effect CIs. [lavaan::standardizedsolution(), parameterEstimates()]


Citation & Session Info

Schweinberger, Martin. 2026. Structural Equation Modelling in R. Brisbane: The University of Queensland. url: https://ladal.edu.au/tutorials/sem/sem.html (Version 2026.02.23).

@manual{schweinberger2026sem,
  author       = {Schweinberger, Martin},
  title        = {Structural Equation Modelling in R},
  note         = {tutorials/sem/sem.html},
  year         = {2026},
  organization = {The University of Queensland, School of Languages and Cultures},
  address      = {Brisbane},
  edition      = {2026.02.23}
}
Code
sessionInfo()
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Australia/Brisbane
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] psych_2.4.12     semTools_0.5-8   semPlot_1.1.7    lavaan_0.6-21   
[5] flextable_0.9.11 tidyr_1.3.2      ggplot2_4.0.2    dplyr_1.2.0     
[9] checkdown_0.0.13

loaded via a namespace (and not attached):
  [1] Rdpack_2.6.2            mnormt_2.1.1            pbapply_1.7-2          
  [4] gridExtra_2.3           sandwich_3.1-1          fdrtool_1.2.18         
  [7] rlang_1.1.7             magrittr_2.0.3          multcomp_1.4-28        
 [10] rockchalk_1.8.157       compiler_4.4.2          png_0.1-8              
 [13] systemfonts_1.3.1       vctrs_0.7.1             reshape2_1.4.4         
 [16] OpenMx_2.21.13          quadprog_1.5-8          stringr_1.5.1          
 [19] pkgconfig_2.0.3         fastmap_1.2.0           arm_1.14-4             
 [22] backports_1.5.0         labeling_0.4.3          pbivnorm_0.6.0         
 [25] rmarkdown_2.30          markdown_2.0            nloptr_2.1.1           
 [28] ragg_1.3.3              purrr_1.0.4             xfun_0.56              
 [31] litedown_0.9            kutils_1.73             jsonlite_1.9.0         
 [34] uuid_1.2-1              jpeg_0.1-11             parallel_4.4.2         
 [37] cluster_2.1.6           R6_2.6.1                stringi_1.8.4          
 [40] RColorBrewer_1.1-3      boot_1.3-31             rpart_4.1.23           
 [43] estimability_1.5.1      Rcpp_1.1.1              knitr_1.51             
 [46] zoo_1.8-13              base64enc_0.1-6         Matrix_1.7-2           
 [49] splines_4.4.2           nnet_7.3-19             igraph_2.1.4           
 [52] tidyselect_1.2.1        rstudioapi_0.17.1       abind_1.4-8            
 [55] yaml_2.3.10             codetools_0.2-20        qgraph_1.9.8           
 [58] lattice_0.22-6          tibble_3.2.1            plyr_1.8.9             
 [61] withr_3.0.2             S7_0.2.1                askpass_1.2.1          
 [64] coda_0.19-4.1           evaluate_1.0.3          foreign_0.8-87         
 [67] survival_3.7-0          RcppParallel_5.1.10     zip_2.3.2              
 [70] xml2_1.3.6              pillar_1.10.1           carData_3.0-5          
 [73] checkmate_2.3.2         renv_1.1.7              stats4_4.4.2           
 [76] reformulas_0.4.0        generics_0.1.3          commonmark_2.0.0       
 [79] scales_1.4.0            minqa_1.2.8             gtools_3.9.5           
 [82] xtable_1.8-4            glue_1.8.0              gdtools_0.5.0          
 [85] mi_1.2                  emmeans_1.10.7          Hmisc_5.2-2            
 [88] tools_4.4.2             data.table_1.17.0       lme4_1.1-36            
 [91] openxlsx_4.2.8          mvtnorm_1.3-3           XML_3.99-0.18          
 [94] grid_4.4.2              sem_3.1-16              rbibutils_2.3          
 [97] colorspace_2.1-1        nlme_3.1-166            patchwork_1.3.0        
[100] htmlTable_2.4.3         Formula_1.2-5           cli_3.6.4              
[103] textshaping_1.0.0       officer_0.7.3           fontBitstreamVera_0.1.1
[106] glasso_1.11             corpcor_1.6.10          gtable_0.3.6           
[109] digest_0.6.39           fontquiver_0.2.1        TH.data_1.1-3          
[112] htmlwidgets_1.6.4       farver_2.1.2            htmltools_0.5.9        
[115] lifecycle_1.0.5         lisrelToR_0.3           fontLiberation_0.1.0   
[118] openssl_2.3.2           MASS_7.3-61            



References

Anderson, James C., and David W. Gerbing. 1988. “Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach.” Psychological Bulletin 103 (3): 411–23.
Hair, Joseph F., William C. Black, Barry J. Babin, and Rolph E. Anderson. 2019. Multivariate Data Analysis. 8th ed. Andover: Cengage.
Hu, Li-tze, and Peter M. Bentler. 1999. “Cutoff Criteria for Fit Indexes in Covariance Structure Analysis: Conventional Criteria Versus New Alternatives.” Structural Equation Modeling: A Multidisciplinary Journal 6 (1): 1–55.
Jackson, Dennis L., J. Arthur Gillaspy, and Rebecca Purc-Stephenson. 2009. “Reporting Practices in Confirmatory Factor Analysis: An Overview and Some Recommendations.” Psychological Methods 14 (1): 6–23.
Kline, Rex B. 2023. Principles and Practice of Structural Equation Modeling. 5th ed. New York: Guilford Press.
McDonald, Roderick P. 1999. Test Theory: A Unified Treatment. Mahwah, NJ: Lawrence Erlbaum Associates.
Nunnally, Jum C. 1978. Psychometric Theory. 2nd ed. New York: McGraw-Hill.

AI Transparency Statement

This tutorial was developed with the assistance of Claude (claude.ai), a large language model created by Anthropic. Claude was used to help draft the tutorial text, structure the instructional content, generate the R code examples, and write the checkdown quiz questions and feedback strings. All content was reviewed, edited, and approved by the author (Martin Schweinberger), who takes full responsibility for the accuracy and pedagogical appropriateness of the material. The use of AI assistance is disclosed here in the interest of transparency and in accordance with emerging best practices for AI-assisted academic content creation.